fix: preserve newlines and show metadata in auto-recall #602
rwmjhb merged 3 commits into CortexReach:master
Two changes to improve auto-recall context quality:
1. sanitizeForContext: replace newlines with literal \n instead of collapsing to spaces. Preserves paragraph structure and meaning, especially important for non-Latin scripts (Hebrew, CJK) where line breaks carry semantic weight.
2. Auto-recall line format: show folder, date, and source from entry metadata instead of category:scope. Users store rich metadata via memory-pro import, so the recall display should surface it.
Before:
- [other:global] all text on one line no structure
After:
- [Goals] 2024-05-30 (apple_notes) text with\npreserved structure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AliceLJY left a comment
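The motivation is entirely reasonable: in rich-metadata import scenarios such as Apple Notes, a generic prefix like [other:global] does lose traceability, and the direction of your improvement is right.
However, the current implementation has one blocking problem:
🔴 The prefix change removes all of the canonical information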
- prefix: `${tierPrefix}[${displayCategory}:${r.entry.scope}]`,
+ prefix: (() => { const f = metaObj.folder ? `[${metaObj.folder}]` : "";
+ const s = metaObj.source ? `(${metaObj.source})` : "";
+ const d = r.entry.timestamp ? new Date(r.entry.timestamp).toISOString().slice(0, 10) : "";
+ return `${f} ${d} ${s}`.trim(); })(),
The new logic completely removes three key signals:
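- tierPrefix (the L0/L1/L2 tier marker): a core feature of mlp that tells the agent which trust tier a memory belongs to. Once it is removed, L0 and L2 entries look identical.
- displayCategory (goal / preference / task / fact, etc.): the main signal the agent uses to identify what type of memory this is.
- scope (global / team:xxx / user:xxx): scope information, which is key to the permission context.
If an entry has no metadata.folder, metadata.source, or timestamp, which is the case for the vast majority of memories that do not come from an import, the prefix becomes the empty string "". For general users this is a serious regression.
📐 Suggestion: append rather than replace
Keep the canonical prefix as the base and append the metadata as an extra dimension: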
prefix: (() => {
  const base = `${tierPrefix}[${displayCategory}:${r.entry.scope}]`;
  const parts: string[] = [base];
  if (metaObj.folder) parts.push(`[${metaObj.folder}]`);
  if (r.entry.timestamp) parts.push(new Date(r.entry.timestamp).toISOString().slice(0, 10));
  if (metaObj.source) parts.push(`(${metaObj.source})`);
  return parts.join(" ");
})(),
This way:
- Users without metadata see the original format: [goals:global]
- Apple Notes imports with metadata see: [other:global] [Goals] 2024-05-30 (apple_notes), so both pieces of information are kept
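🟡 I agree with the other change (\\n instead of spaces)
Replacing [\r\n]+ in sanitizeForContext with the literal characters \n (not a real newline character) is a smart choice:
- It signals to the LLM that there is a paragraph break at this point
- It does not break the single-line format of a recall line
This part can stay.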
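The newline handling described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `sanitizeNewlines` is a hypothetical stand-in for the relevant step of `sanitizeForContext`, whose other sanitization steps are not shown in this PR and are omitted here.

```typescript
// Hypothetical helper: only the newline step of sanitizeForContext is modeled.
function sanitizeNewlines(text: string): string {
  // Replace each run of CR/LF characters with the two characters
  // backslash + "n", so the result stays on a single line.
  return text.replace(/[\r\n]+/g, "\\n");
}

const recalled = sanitizeNewlines("first paragraph\r\n\r\nsecond paragraph");
console.log(recalled); // first paragraph\nsecond paragraph
```

Because the pattern matches a whole run of line-break characters, consecutive blank lines collapse into a single `\n` marker; a per-line replacement would be needed if blank-line counts had to be preserved.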
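🧪 Tests
The test plan in the PR body is still unchecked ([ ] empty checkboxes). Could you add:
- Recall an entry with metadata and assert that the prefix contains folder/date/source
- Recall an entry without metadata and assert that the prefix keeps the original [category:scope] format
- Assert that tierPrefix is displayed correctly for both the L1 and L2 tiers
I will re-review once this is updated. Thanks.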
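The requested test cases can be made concrete with a standalone version of the "append rather than replace" prefix. This is only a sketch under assumptions: `buildRecallPrefix`, `RecallEntry`, and `RecallMeta` are hypothetical names introduced for illustration, and the `[W]` tier marker is borrowed from the example in a later commit message rather than from the plugin's source.

```typescript
// Hypothetical shapes for illustration; the plugin's real types may differ.
interface RecallEntry {
  scope: string;
  timestamp?: number; // epoch milliseconds (assumed)
}
interface RecallMeta {
  folder?: string;
  source?: string;
}

// Keep the canonical prefix as the base, then append metadata when present.
function buildRecallPrefix(
  tierPrefix: string,
  displayCategory: string,
  entry: RecallEntry,
  meta: RecallMeta,
): string {
  const parts: string[] = [`${tierPrefix}[${displayCategory}:${entry.scope}]`];
  if (meta.folder) parts.push(`[${meta.folder}]`);
  if (entry.timestamp) parts.push(new Date(entry.timestamp).toISOString().slice(0, 10));
  if (meta.source) parts.push(`(${meta.source})`);
  return parts.join(" ");
}

const plain = buildRecallPrefix("", "goals", { scope: "global" }, {});
const imported = buildRecallPrefix(
  "",
  "other",
  { scope: "global", timestamp: Date.UTC(2024, 4, 30) },
  { folder: "Goals", source: "apple_notes" },
);
console.log(plain);    // [goals:global]
console.log(imported); // [other:global] [Goals] 2024-05-30 (apple_notes)
```

Pulling the prefix logic into a pure function like this is one way to test the three cases (with metadata, without metadata, with a tier marker) without standing up the full recall pipeline.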
…recall prefix
- Check r.entry.category === "other" (raw stored value) instead of displayCategory, since parseSmartMetadata always enriches "other" to a semantic category via reverseMapLegacyCategory, making displayCategory === "other" unreachable.
- Retain tierPrefix and scope in prefix (restore what PR originally removed).
- Append date and source suffix only when available.
- Apple Notes import with folder "Goals" now renders as [Goals:global] instead of [other:global] or [patterns:global].
- Entries without folder metadata are unaffected; the canonical prefix is preserved.
- Add 3 tests covering: folder override for "other" entries, no override for non-"other" entries, and tier prefix presence for entries with tier metadata.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds an optional recallPrefix.categoryField plugin config that lets users
specify which raw metadata field to use as the category label in auto-recall
prefix lines, instead of the built-in category.
When set, the value of metadata[categoryField] replaces the built-in category
in the [category:scope] prefix — falling back to displayCategory when the
field is absent on an entry.
This makes it easy to surface meaningful grouping labels from import-based
workflows (e.g. Apple Notes folder names, Notion notebooks, Obsidian
collections) without hardcoding any field names in core logic.
Default behavior (categoryField unset) is unchanged — built-in category
is used for all entries, so existing users see no difference.
Example config:
recallPrefix: { categoryField: "folder" }
// entry with metadata.folder = "Goals" → prefix: [W][Goals:global]
// entry without metadata.folder → prefix: [W][preferences:global]
Adds 3 tests covering: field present, field absent (fallback), and
no config (default behavior unchanged).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
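The fallback behavior described in this commit message can be sketched as below. Only the `recallPrefix.categoryField` option comes from the commit message; `RecallPrefixConfig`, `resolveCategoryLabel`, and the choice to accept only non-empty string values are assumptions made for illustration.

```typescript
// Hypothetical config shape; only categoryField is named in the commit message.
interface RecallPrefixConfig {
  categoryField?: string;
}

// Pick the label shown in [category:scope]: the configured metadata field
// when it is present on the entry, otherwise the built-in category.
function resolveCategoryLabel(
  displayCategory: string,
  metadata: Record<string, unknown>,
  config?: RecallPrefixConfig,
): string {
  const field = config?.categoryField;
  if (!field) return displayCategory; // default behavior unchanged
  const value = metadata[field];
  // Assumption: only non-empty strings are usable as labels.
  return typeof value === "string" && value.length > 0 ? value : displayCategory;
}

const config: RecallPrefixConfig = { categoryField: "folder" };
console.log(`[W][${resolveCategoryLabel("preferences", { folder: "Goals" }, config)}:global]`); // [W][Goals:global]
console.log(`[W][${resolveCategoryLabel("preferences", {}, config)}:global]`); // [W][preferences:global]
```

Keeping the field name in config rather than in code is what lets other import workflows reuse the same path with a different metadata key.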
Hi @AliceLJY, thanks for the thorough review. You're right that removing
What changed
The prefix format is fully restored:
The Apple Notes metadata display is now handled via a new optional config field
```typescript
// result
```
Why this approach
Tests
Added 3 tests covering: field present → uses field value, field absent → falls back to built-in category, no config → built-in category unchanged. All 22 tests in the file pass. No regressions in the full suite (4 pre-existing failures unrelated to this PR).
Summary
Two changes to improve auto-recall context injection quality:
1. Preserve newlines in sanitizeForContext: replace \r\n with literal \n instead of collapsing to a space. Line breaks carry semantic weight, especially in non-Latin scripts (Hebrew, CJK, Arabic) where a newline separates distinct thoughts. Collapsing to space merges them into an unreadable run-on.
2. Show entry metadata in recall line format: display folder, date, and source from entry metadata instead of the generic [category:scope]. Users who import memories via memory-pro import with rich metadata (folder organization, source tracking, timestamps) lose all of that context in the current display format. The agent can't tell where a recalled memory came from.
Before:
After:
Context
Discovered while building an Apple Notes → memory-pro import pipeline. 1,800+ personal notes imported with folder, source, author, and date metadata. Auto-recall surfaced them with no attribution and destroyed the original formatting.
Test plan
\n in recalled text
🤖 Generated with Claude Code